Mixed Precision Dense Linear System Solvers for High Performance Reconfigurable Computing

Authors

  • JunKyu Lee
  • Gregory D. Peterson
  • Robert J. Harrison
  • Robert J. Hinde
Abstract

The iterative refinement method for linear system solvers can improve performance while maintaining numerical accuracy. Previous work on iterative refinement exploits single and double precision on CPU, GPU, or Cell/BE processors. Because those platforms support only two precisions, the benefit of iterative refinement on them is limited. Reconfigurable Computing (RC) is a strong candidate for exploiting iterative refinement, since it can employ computation at any precision as long as the hardware resources are sufficient. In iterative refinement for RC, the choice of working precision for Gaussian elimination is extremely important, since its computational complexity is O(n^3) while the other steps are O(n^2). In this paper, we explore RC architecture and working precision for Gaussian elimination to obtain both high performance and satisfactory numerical solutions.
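The refinement loop described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' RC implementation: the function names (`to_f32`, `lu_factor_low`, `lu_solve_low`, `refine_solve`) are hypothetical, and rounding to IEEE single precision through the standard `struct` module is only a stand-in for whatever reduced working precision the hardware would use. The O(n^3) factorization runs once in low precision; each refinement step costs only O(n^2), with the residual computed in double precision.

```python
import struct


def to_f32(x):
    """Round a double to the nearest IEEE single-precision value (emulated low precision)."""
    return struct.unpack("f", struct.pack("f", x))[0]


def lu_factor_low(a):
    """LU factorization with partial pivoting, every result rounded to float32. O(n^3)."""
    n = len(a)
    lu = [[to_f32(v) for v in row] for row in a]
    perm = list(range(n))  # perm[i] = original index of the row now at position i
    for k in range(n):
        p = max(range(k, n), key=lambda i: abs(lu[i][k]))
        if lu[p][k] == 0.0:
            raise ValueError("matrix is singular in working precision")
        lu[k], lu[p] = lu[p], lu[k]
        perm[k], perm[p] = perm[p], perm[k]
        for i in range(k + 1, n):
            lu[i][k] = to_f32(lu[i][k] / lu[k][k])
            for j in range(k + 1, n):
                lu[i][j] = to_f32(lu[i][j] - to_f32(lu[i][k] * lu[k][j]))
    return lu, perm


def lu_solve_low(lu, perm, b):
    """Forward and back substitution in emulated float32. O(n^2)."""
    n = len(lu)
    y = [to_f32(b[perm[i]]) for i in range(n)]
    for i in range(n):  # solve L y = P b (unit lower triangular)
        for j in range(i):
            y[i] = to_f32(y[i] - to_f32(lu[i][j] * y[j]))
    for i in reversed(range(n)):  # solve U x = y
        for j in range(i + 1, n):
            y[i] = to_f32(y[i] - to_f32(lu[i][j] * y[j]))
        y[i] = to_f32(y[i] / lu[i][i])
    return y


def residual(a, x, b):
    """r = b - A x, computed in double precision. O(n^2)."""
    return [bi - sum(aij * xj for aij, xj in zip(row, x)) for row, bi in zip(a, b)]


def refine_solve(a, b, max_iter=20, tol=1e-14):
    """Solve A x = b: low-precision LU once, then double-precision refinement steps."""
    lu, perm = lu_factor_low(a)
    x = lu_solve_low(lu, perm, b)
    b_norm = max(abs(v) for v in b)
    for _ in range(max_iter):
        r = residual(a, x, b)
        if max(abs(v) for v in r) <= tol * b_norm:
            break
        d = lu_solve_low(lu, perm, r)  # correction reuses the low-precision factors
        x = [xi + di for xi, di in zip(x, d)]
    return x
```

For a small, well-conditioned system such as `refine_solve([[4.0, 1.0, 2.0], [1.0, 5.0, 1.0], [2.0, 1.0, 6.0]], [12.0, 14.0, 22.0])`, the refined solution should approach the exact answer (1, 2, 3) to roughly double-precision accuracy even though the factorization used only single precision. Convergence is not guaranteed for ill-conditioned matrices, which is why the choice of working precision matters.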

Related resources

Mixed Precision Comparison in Reconfigurable Systems

Customisable data formats provide an opportunity for exploring trade-offs in accuracy and performance of reconfigurable systems. This paper introduces a novel methodology for mixed-precision comparison, which improves comparison performance by using reduced-precision datapaths while maintaining accuracy by using high-precision datapaths. Our methodology adopts reduced-precision data-paths for p...

Efficient Parallel Solvers for Large Dense Systems of Linear Interval Equations

Verified solvers for dense linear (interval-)systems require a lot of resources, both in terms of computing power and memory usage. Computing a verified solution of large dense linear systems (dimension n > 10000) on a single machine quickly approaches the limits of today’s hardware. Therefore, an efficient parallel verified solver for distributed memory systems is needed. In this work we prese...

High Performance Computing Benchmark Tool for Parallel Processing of Large Models

Benchmarks for parallel processing of large models are an urgent need for High Performance Computing (HPC) as today’s model size reaches millions of degrees of freedom. Explicit solvers as in the case of crash dynamics or fluid dynamics do not require matrix based equation solvers and inherently exhibit good scalability on large numbers of processors. Whereas analysis requiring implicit solvers...

Breakthroughs in Sparse Solvers for GPUs

The CUDA Center of Excellence (CCOE) at UTK targets the development of innovative algorithms and technologies to tackle challenges in Heterogeneous High Performance Computing. Over the last year, the CCOE at UTK developed CUDA-based breakthrough technologies in sparse solvers for GPUs. Here, we describe the main ones – a sparse iterative solvers package, a communication-avoiding (CA) sparse ite...

Energy Footprint of Advanced Dense Numerical Linear Algebra using Tile Algorithms on Multicore Architecture

We propose to study the impact on the energy footprint of two advanced algorithmic strategies in the context of high performance dense linear algebra libraries: (1) mixed precision algorithms with iterative refinement make it possible to run at the peak performance of single precision floating-point arithmetic while achieving double precision accuracy and (2) tree reduction technique exposes more parallel...

Publication date: 2009